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The modelling and analysis of biological systems has deep roots in Mathematics, specifically in 
the field of ordinary differential equations (ODEs). Alternative approaches based on formal calculi, 
often derived from process algebras or term rewriting systems, provide a quite complementary way 
to analyze the behaviour of biological systems. These calculi allow to cope in a natural way with 
notions like compartments and membranes, which are not easy (sometimes impossible) to handle 
with purely numerical approaches, and are often based on stochastic simulation methods. Recently, 
it has also become evident that stochastic effects in regulatory networks play a crucial role in the 
analysis of such systems. Actually, in many situations it is necessary to use stochastic models. For 
example when the system to be described is based on the interaction of few molecules, when we 
are at the presence of a chemical instability, or when we want to simulate the functioning of a pool 
of entities whose compartmentalised structure evolves dynamically. In contrast, stable metabolic 
networks, involving a large number of reagents, for which the computational cost of a stochastic 
simulation becomes an insurmountable obstacle, are efficiently modelled with ODEs. In this paper 
we define a hybrid simulation method, combining the stochastic approach with ODEs, for systems 
described in CWC, a calculus on which we can express the compartmentalisation of a biological 
system whose evolution is defined by a set of rewrite rules. 

1 Introduction 

The most common approach of biologists to describe biological systems is based on the use of deter- 
ministic mathematical means like, e.g., ordinary differential equations (ODEs for short). ODEs make 
it possible to abstractly reason on the behaviour of biological systems and to perform a quantitative 
in silico investigation. This kind of modelling, however, becomes more and more difficult, both in 
the specification phase and in the analysis processes, when the complexity of the biological systems 
taken into consideration increases. This has probably been one of the main motivations for investigat- 
ing the description of biological systems by means of formalisms developed in Computer Science for 
the description of computational entities |2TI . Different formalisms have either been applied to (or 
have been inspired from) biological systems. Automata-based models [2, 16] have the advantage of al- 
lowing the direct use of many verification tools such as model checkers. Rewrite systems |[T0l |20l [4j 
usually allow describing biological systems with a notation that can be easily understood by biologists. 
Both automata-like models and rewrite systems present, in general, problems from the point of view of 
compositionality, which allows studying the behaviour of a system componentwise. Compostionality, 
instead, is in general ensured by Process calculi, included those commonly used to describe biologi- 
cal systems ||2T1 [T9l l6l. Quantitative simulations of biological models represented with these kind of 
frameworks (see, e.g. U9J \TT\ [15] |3j [I2j [8]) are usually developed via a stochastic method derived by 
Gillespie's algorithm fl3l . 
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The ODE description of biological systems determines continuous, deterministic models in which 
variables describe the concentrations of the species involved in the system as functions of the time. 
These models are based on average reaction rates, measured from real experiments which relate to the 
change of concentrations over time, taking into account the known properties of the involved chemicals, 
but possibly abstracting away some unknown mechanisms. Given the reaction equations (together with 
their rates) and the initial amount for each species, an ODEs model can be constructed by writing a 
differential equation for each biochemical specie whose concentration changes over time. 

In contrast to the deterministic model, discrete, stochastic simulations involve random variables. 
Therefore, the behaviour of a reaction is not determined a priori but characterized statistically. Since 
biological reactions fall in the category of stochastic systems (the very basic steps of every molecu- 
lar reaction can be described only in terms of its probability of occurrence), stochastic kinetic models 
are increasingly accepted as the best way to represent and simulate genetic and biochemical networks. 
Moreover, when the system to be described is based on the interaction of few molecules, or we want to 
simulate the functioning of a little pool of cells it is necessary to use stochastic models. 

The stochastic approach is always valid when the deterministic one is, and it may be valid when the 
ordinary deterministic is not (i.e. in a nonlinear system in the neighborhood of a chemical instability). 
Actually, in the last years it has become evident that stochastic effects in regulatory networks play a 
crucial role in the analysis of such systems (for example in case of multi-stable systems). In contrast, 
metabolic networks involving large numbers of molecules are most often modelled deterministically. 
Thus, because of the bimodal nature of biological systems, it may happen that a purely deterministic 
model does not accurately capture the dynamics of the considered system, and a stochastic description 
is needed. However, the computational cost of a discrete simulation often becomes an insurmountable 
obstacle. Computationally, the ODEs method is extremely more efficient. Thus, when the deterministic 
approach is applicable, it might be profitable to take advantage of its efficiency, and move to the stochastic 
approach when it is not. In a hybrid model, some reactions are modelled in a discrete way (i.e. computed, 
probabilistically, according to a stochastic method) and others in a continuous way (i.e. computed, in a 
deterministic way, by a set of ODEs). 

Hybrid models for the simulation of biological systems have been presented in the last few years for 
purely mathematical models |[22l[T4l l9ll. In this paper we adapt the hybrid simulation technique within 
the programming language approach to describe and analyse the dynamics of biological systems. 

In [8] we proposed the Calculus of Wrapped Compartments (CWC for short), a simplification of the 
Calculus of Looping Sequences (CLS for short) HO. Starting from an alphabet of atomic elements, 
CWC terms are defined as multisets of elements and compartments. Elements can be localized by 
compartmentalisation and the structure of a compartment can be specified by detailing the elements of 
interest on its membrane. The evolution of the system is driven by a set of rewrite rules modelling the 
reactions of interest. We provided CWC with a stochastic operational semantics from which a continuous 
time Markov chain can be build following the standard Gillespie's approach [ 13]. 

In this paper we define a hybrid simulation method for systems described in CWC; thus: (1) we 
are able to simulate systems with compartments, (2) we use the stochastic simulation method when the 
deterministic one is not valid, and (3) we exploit the efficiency of the deterministic approach whenever 
it is applicable. 

Summary. Section [2] introduces the CWC formalism. Section [3] recalls the stochastic and the deter- 
ministic simulation methods. Section[4]introduces the hybrid simulation technique and Section [5] applies 
it to the analysis of the HIV-1 transactivation mechanism. Finally, in Section[6j we draw our conclusions. 
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Simple terms syntax 

t ::= a (a\t) e 

Structural congruence 

t U W V = t W U V 

if a = b and t = u then (a\i) =(b\u) 

Figure 1: CWC term syntax and structural congruence rules 

2 The Calculus of Wrapped Compartments 

Like most modelling languages based on term rewriting (notably CLS), a CWC (biological) model con- 
sists of a term, representing the system and a set of rewrite rules which model the transformations de- 
termining the system's evolution. The calculus presented here is a slight variant of the one introduced 
in JS). Namely, compartments are enriched with a nominal type which identifies the set of rewrite rules 
that can be applied on that compartment. 

Terms and Structural Congruence A term of the CWC calculus is intended to represent a biological 
system. A term is a multiset of simple terms. Simple terms, ranged over by t, u, v, w, are built by 
means of the compartment constructor, (— J — )~, from a set srf of atomic elements {atoms for short), 
ranged over by a, b, c, d, and from a set Jz? of compartments types (represented as labels attached to 
compartments and rules), ranged over by £,£',£i,. . .. The syntax of simple terms is given at the top of 
Figure [I] We write t to denote a (possibly empty) multiset of simple terms t\ ■ ■ -t n . Similarly, with a 
we denote a (possibly empty) multiset of atoms. The set of simple terms will be denoted by ST. The 
set of terms (multiset of simple terms) and the set of multisets of atoms will be denoted by ,^ and sd ', 
respectively. Note that srf C 2? . 

A term i = t\ ■ ■ • t„ should be understood as the multiset containing the simple terms t\,...,t n . There- 
fore, we introduce a relation of structural congruence, following a standard approach in process algebra. 
The CWC structural congruence is the least equivalence relation on terms satisfying the rules given at 
the bottom of Figure[T] From now on we will always consider terms modulo structural congruence. Then 
a simple term is either an atom or a compartment (a J t) f: consisting of a wrap (represented by the multiset 
of atoms a), a content (represented by the term t) and a type (represented by the label £). We write the 
empty multiset as • and denote the union of two multisets u and v as u v. Let's extend the notion of 
subset (denoted as usual as C) between terms interpreted as multisets. 

An example of term ist = ab (c d \e fy representing a multiset consisting of two atoms a and b 
(for instance two molecules) and an ^-type compartment (c d J e f) 1 which, in turn, consists of a wrap 
(a membrane) with two atoms c and d (for instance, two proteins) on its surface, and containing the 
atoms e (for instance, a molecule) and / (for instance a DNA strand). See Figure [2] for some graphical 
representations. 

Rewrite Rules, Variables, Open Terms and Patterns A rewrite rule is defined as a pair of terms 
(possibly containing variables), which represent the patterns defining the system transformations, to- 
gether with a label I representing the compartment type on which the rule can be applied. Compartments 
are identified by the notion of (labelled) reduction context introduced below. A rule is applicable in a 
compartment if its content matches the left-hand side of the rule via a proper instantiation of its variables 
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Figure 2: (i) represents (a b c\ •) ; (ii) represents (a 
(abc\(de\*y fgY 



(iii) 

b c\ (d e\ •Y'Y; (iii) represents 



(note this instantiation is in general not unique). A system transformation is obtained by replacing the 
reduced subterm by the corresponding instance of the right-hand side of the rule. 

In order to formally define the rewriting semantics, we introduce the notion of open term (a term 
containing variables) and pattern (an open term that may be used as left part of a rewrite rule). In 
order to respect the syntax of terms, we distinguish between "wrap variables" which may occur only in 
compartment wraps (and can be replaced only by multisets of atoms) and "term variables" which may 
only occur in compartment contents or at top level (and can be replaced by arbitrary terms). Therefore, 
we assume a set of term variables, Y-^, ranged over by X, Y,Z, and a set of wrap variables, ranged 
over by x, y, z- These two sets are disjoint. We denote by V the set of all variables U 1^, and with p 
any variable in V. 

• Open terms are terms which may contain occurrences of wrap variables in compartment wraps and 
term variables in compartment contents or at top level. They can be seen as multisets of simple 
open terms. More formally, open terms, ranged over by O and simple open terms, ranged over by 
o, are defined in the following way: 

O ::= o 

o ::= a • X • (a x\ o~Y 

We denote with 6 the set of open terms. An open term is linear if each variable occurs at most 
once. 

• Patterns, ranged over by P, and simple patterns, ranged over by p, are the linear open terms defined 
in the following way: 

P ■■= p 

p ::= t (ax\pXY 
We denote with £P the set of patterns. 

An instantiation is a partial function a : "V — > ST . An instantiation must preserve the type of variables, 
thus for X G ^ and xei^we have o(X) G and a(x) G si, respectively. Given O G 6, with Oo 
we denote the term obtained by replacing each occurrence of each variable pef appearing in O with 
the corresponding term cr(p). 

Let £ denote the set of all the possible instantiations and Var(O) denote the set of variables appearing 
in O G e. 

A rewrite rule is a triple (£,P, O), also denoted by I : Pi — >0, where P G 2? and 06^ are such that 
Var(0) C Var(P). The label £ denotes the type of the compartments where the rule can be applied. A 
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rewrite rule £ : Pi — > O then states that a subterm Po, obtained by instantiating variables in P by some 
instantiation function a, can be transformed into the subterm Oo within any compartment of type I. We 
use the special label T G Jz? to denote the type of the top level of a term. 

Contexts The definition of reduction for CWC systems is completed by resorting to the notion of 
reduction context. To this aim, the syntax of terms is enriched with a new element □ representing a hole. 
Reduction context (ranged over by C) are defined by: 

C ::= □ | Ct | (a\C) e 

where a € £/, t £ 2F and I £ Jzf '. We denote with ^ the infinite set of contexts. 

By definition, every context contains a single hole □. Let us assume C,C G c €. With C[t] we denote 
the term obtained by replacing □ with t in C; with C[C'\ we denote context composition, whose result is 
the context obtained by replacing □ with C in C. For example, given c = (a b\ ny i, a = (c d\ U) e g h 
and t = e f, we get C[C'[t}} = (ab\(c d\e ff g h) e i. 

In order to apply a rule within a compartment of the correct type we define a function that, given a 
context, returns the label of the innermost compartment containing the hole. If the hole appears at top 
level, the distinguished label T is returned. The function Lab is defined as follows: 

(r s_{ T if C = □ ? 

LAB(C)- j t ifC = C'[(a\Di) 1 ] 

Qualitative Reduction Semantics A CWC system over a set srf of atoms and a set of labels is 
represented by a set £2^^ (=2 for short when stf and Jz? are understood) of rewrite rules over and Jzf '. 

The qualitative reduction semantics of a CWC system J2 is the least transition relation satisfying the 
following rule: 

t-.P\ — >og£ aei ce? lab(c)=£ 

C[Po] -> C[Oo] 

Modelling Guidelines In this section we give some explanations and general hints about how CWC 
could be used to represent the behaviour of various biological systems. Here, entities are represented by 
terms of the rewrite system, and events by rewrite rules. 

First of all, we should select the biomolecular entities of interest. Since we want to describe cells, 
we consider molecular populations and membranes. Molecular populations are groups of molecules that 
are in the same compartment of the cells and inside them. As we have said before, molecules can be of 
many types: we classify them as proteins, chemical moieties and other molecules. 

Membranes are considered as elementary objects: we do not describe them at the level of the phos- 
pholipids they are made of. The only interesting properties of a membrane are that it may have a con- 
tent (hence, create a compartment) and that in its phospholipid bilayer various proteins are embedded, 
which act for example as transporters and receptors. Since membranes are represented as multisets of 
the embedded structures, we are modeling a fluid mosaic in which the membranes become similar to a 
two-dimensional liquid where molecules can diffuse more or less freely ||23l . 

Compartment labels are useful to identify the kind of a compartment. For example, we may use 
compartment labels to denote a nucleus within a cell, the different organelles, etc.. 
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Biomolecular Event 

State change 

Complexation 

Decomplexation 

State change on membrane 

Complexation 

on membrane 

Decomplexation 

on membrane 

Membrane crossing 



Catalyzed 
membrane crossing 
Membrane joining 



Catalyzed 
membrane joining 



CWC Rewrite Rules 

ai — >b 
a b i — > c 
c i — > a b 

(a x\X) i — > (b x\X) 
a {bx\X)\ — > (cx\X) ~ 
(b x\aX)< — > \cx\X) 
(cx\X)< — >a (b x\X) 
(cx\X)< — >(bx\aX) 
a (xJX) i — > \x\aX) 
fx J a JSC") i — > a (x J X) 
a (b xj X) i — > (b xj a X) 
(bx\aX)\ — >a (b x\X) 
a (x\X) i — > (a x\X) 
(x\aX)< — >(ax\X) 
a (b xj X) i — > (a b xj X) 
(bx\aX)\ — >(abx\X) 
(x\abX)\ — >{ax\bX) 



Table 1: Guidelines for modelling biomolecular events in CWC 



Table [T] lists the guidelines (taken from HI) for the abstraction into CWC rules of some basic 
biomolecular events, some of which will be used in our applications [j] Entities are associated with CWC 
terms: elementary objects (genes, domains, etc..) are modelled as atoms, molecular populations as CWC 
terms, and membranes as atom multisets. Biomolecular events are associated with CWC rewrite rules. 

The simplest kind of event is the change of state of an elementary object. Then, there are interactions 
between molecules: in particular complexation, decomplexation and catalysis. Interactions could take 
place between simple molecules, depicted as single symbols, or between membranes and molecules: 
for example a molecule may cross or join a membrane. Finally, there are also interactions between 
membranes: in this case there may be many kinds of interactions (fusion, vesicle dynamics, etc. . . ). 

3 Quantitative Simulation Models for CWC 

A quantitative operational semantics for CWC can be defined by associating to the rewriting rules of 
CWC the kinetic constant k of the modeled chemical reaction. A quantitative rewrite rule is then a 

quadruple (£,P,0,k), denoted with £ : pJ^O, where £,P and O are in Section]^ and k G IR-°. 

In this section we introduce two (standard) quantitative simulation methods for CWC based respec- 
tively on Gillespie's stochastic simulation algorithm |[T3ll and on the (deterministic) solution of ordinary 
differential equations. These two approaches, that will be presented separately in this section, will be 
integrated in the next section defining the hybrid semantics of CWC. 

A prototype implementation of the hybrid CWC calculus (which encompasses both the pure stochas- 
tic and the deterministic versions of the calculus) is available HI. 

Remark 3.1 Notice that the stochastic, deterministic and hybrid approaches introduced in this section 
simulate well-stirred system of molecules, confined to a constant volume and in thermal equilibrium at 
some constant temperature. In these conditions we can describe the systems state by specifying only 

'Compartment labels are omitted for simplicity, just notice that the rules shown in the table can be specified to apply only 
within a given type of compartment. 
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0.15 


0.2 


0.15 
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0.2 


0.15 


0.2 


C 


0.15 


0.2 


0.15 



Table 2: Interaction Matrix 



the molecular populations, ignoring the positions and velocities of the individual molecules. Different 
approaches such as Molecular Dynamics, Partial Differential Equations or Lattice-based methods are 
required in case of molecular crowding, anisotropy of the medium or canalization. 

Running Example In order to illustrate the quantitative semantics of CWC we consider, as a running 
example, a toy case study based on a compartmentalized variant of a system studied in ecology to de- 
scribe a generalized competitive Lotka-Volterra dynamics E4l . The variant consists in an schema of 
ecoregions bounded by geographical frontiers rendered by compartments. The system has 3 competitive 
species in 2 environmental compartments. The supposed population dynamics depend on the following 
parameters: 

• the interaction matrix jU,j > 0, where the element ;U ; j represents the relative strength that species 
i has on the population of species j, i.e. the competition between species in the same environment 
due to incompatibility; 

• the carrying capacity K[ > is the population size of the species i that the environment can sustain 
indefinitely assuming no interaction between the species; 

• the migration rate d\ associated to each species which migrates between the compartments. 

In our tests, we set the interaction matrix in accordance to Table [2] The carrying capacities are 
Ki = 100 and the migration rates between the compartments are: c?a = 0.01, ds = 0.01, dc = —0.01. 
The migration rates are positive for species A and B moving from the compartment to the outside, and 
negative for species C which moves in the reverse path. This system has a wide set of possible behaviour, 
particularly the compartmentalization can be interpreted as the case of an ecological frontier like a river 
or a mountain which partially separates the different populations. 

The set of CWC rules adopted in our toy case study is given in Figure[3j where the rates of competi- 
tion of the species i against the species j were calculated as kjj = The two environmental compart- 
ments are represented by the implicit top level compartment (of type T) and by an explicit compartment 
of type IN. Rules (N\ —N3) model the migration of the three species between the two compartments. The 
other rules model the competition between species (rules B4 — B9) and their reproduction capacity (rules 
B\ —B3). Note that rule (fii ), is indeed a compact representation for the rules: 

(B[) T:Ah4AA (B'{) IN:A^AA 

Similarly for the rules (B2), ■ . ■ , (Bg). The simulations will be performed for 35 time units, with the 
starting termj^] 

2xC («J2xA 2xB) m . 



The notation n x a, where n is a natural number and a is an atom, denotes the multiset containing n occurrences of a. 
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(Ni) T :{x\AX) IN ^{x\X) IN A 

(N 2 ) T:(x\B X) 1N A (xj X) IN B 

(N 3 ) T: (x\X) m cA {x\CX) IN 
(fli) T,IN:A^A A 

(B 2 ) T,IN:B \—} B B 

(B 3 ) T,IN:C^C C 

(B 4 ) T,IN:AA^A-» 
(B 5 ) T,IN:B • 

(B 6 ) T,W:C C&$* 

(B 7 ) T,W:Afih4. 
(flg) T,W:AC^. 
(B 9 ) T,IN:B C^>« 

Figure 3: CWC rules for the test case 



3.1 Stochastic Evolution 

A stochastic simulation model for biological systems can be defined by incorporating a collision-based 
stochastic framework along the line of the one presented by Gillespie in IPT31 . which is, de facto, the 
standard way to model quantitative aspects of biological systems. The idea of Gillespie's algorithm is 
that a rate constant is associated with each considered chemical reaction. Such a constant is obtained by 
multiplying the kinetic constant of the reaction by the number of possible combinations of reactants that 
may occur in the system. The resulting rate is then used as the parameter of an exponential distribution 
modelling the time spent between two occurrences of the considered chemical reaction. Following the 
law of mass action, it is necessary to count the number of reactants that are present in a system in order 
to compute the exact rate of a reaction. The same approach has been applied, for instance, to define the 
quantitative semantics of the stochastic 7r-calculus |[T8l[T9l . 

The use of exponential distributions to represent the (stochastic) time spent between two occurrences 
of chemical reactions allows describing the system as a Continuous Time Markov Chain (CTMC), and 
consequently allows verifying properties of the described system analytically and by means of stochastic 
model checkers. 

The number of reactants in a reaction represented by a rewrite rule is evaluated considering the 
number of distinct occurrences, in the same context, of subterms to which the rule can be applied 
producing the same term. For instance in evaluating the application rate of the stochastic rewrite rule 

k 

R = T : a b i — > c to the term t = a a b b we must consider the number of the possible combinations 
of reactants of the form a b int. Since each occurrence of a can react with each occurrence of b, this 
number is 4. So the application rate of R is k ■ 4. 

The evaluation of the application rate of a reduction rule containing variables is more complicate 
since there can be many different ways in which variables can be instantiated to match the subterm to be 
reduced, and this must be considered to correctly evaluate the application rate. Given two terms t,u and 
a reduction rule R we can compute the number of possible applications of the rule R to the term t in a 
context C[ ] of type £, resulting in the term C[u]. We denote this number by Occ(R,C[t],C[u}), where the 
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function Occ is analogous to the one defined for SCLS in [3]. The stochastic reduction semantics of a 
CWC system J2 is the least labelled transition relation satisfying the following rule: 

R = l-pJ^Oe£ ael Cg^ Lab(C) = £ 
C[Po] k - 0cc(R - C[Pa]A °%C[Oa] 

Reductions determined by a rule R are labelled with their rates. The rate of a reduction is obtained as the 
product of the rewrite rate constant and the number of occurrences of the rule within the starting term 
(thus counting the exact number of reactants to which the rule can be applied and which produce the same 
result). The rate associated with each transition in the stochastic reduction semantics is the parameter 
of an exponential distribution that characterizes the stochastic behaviour of the activity corresponding 
to the applied rewrite rule. The stochastic semantics is essentially a Continuous Time Markov Chain 
(CTMC). Given a term t, a global time 8 and all the reductions e\,...,eM that can be applied to t, with 
rates r\,. . . ,tm such that r = YaLi r i> the standard simulation procedure that corresponds to Gillespie's 
simulation algorithm [13] consists of the following two steps: 

1 . The time 8 + T at which the next stochastic reduction will occur is randomly chosen with X expo- 
nentially distributed with parameter r; 

2. The reduction e\ that will occur at time 8 + X is randomly chosen with probability ^. 

Running Example: Stochastic Simulations We performed several stochastic simulations of the toy 
case study, showing many, different, possible evolutions. Two of these runs are shown in Figure |4| 

A characteristic of this example is that in the initial phase, mainly due to the small number of indi- 
viduals involved, the evolution of the system is strongly determined by random events that can change 
dramatically the destiny of the species. In the first experiment, on the top of Figure |4] the populations 
A and C overtake population B both inside and outside compartment IN. The second experiment, on the 
bottom of Figure [4j shows a completely different fate for the populations: namely, population B over- 
takes populations A and C inside the compartment IN while population C overtakes populations A and B 
outside the compartment. Note that the cases shown here are just two possible examples of the many dif- 
ferent destinies for the three populations. The second one, in particular, differs sensibly from the average 
behaviour appearing in the deterministic simulation which is shown in Figure [5] 

3.2 Deterministic Evolution 

The standard way to express the evolution of a biochemical system is via ODEs. We define the deter- 
ministic reduction semantics for a subset of CWC quantitative rewrite rules, that we call biochemical 
rewrite rules, expressing biochemical reactions. 

_ k — _ — 

Biochemical rewrite rules are the quantitative rewrite rules of the form I : a i — > b, where a and b are 
multisets of atomic elements. 

All the reactants of a biochemical reaction are completely specified, since both sides of the rules do 
not contain variables. Moreover, biochemical reactions are local to a single compartment. Reactions 
that invoke and/or change the structure of compartments cannot be expressed with biochemical rewrite 
rules. Actually, referring to Table [T] we notice that biochemical rewrite rules can be used to model state 
change, complexation and decomplexation: these are exactly the kinds of reactions naturally eligible to 
be simulated with ODEs. 
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Figure 4: Two different runs of the stochastic simulations showing the different behaviour of the dynam- 
ics of the competitive species inside the compartment IN (on the left side) and outside the compartment 
(on the right side) 

A CWC system J2 consisting of r biochemical rewrite rules represents a system of r biochemical 
reactions. Its deterministic semantics is defined by extracting from £? a system of ODEs to be used for 
simulating the evolution of the involved multisets of atoms. For every label I, let 

• a\ , . . . ,a m (ri£ > 1) denote the species of atoms that may occur at top level within a compartment 
of type I, and 

• denote the set of rules with label t. 
The j-th rule in the set £}(■ is denoted by 



: a. 



i = 1,2, 



For all species aj (j = 1,2, . . .,ni) let afj be the number of atoms of species aj consumed by the z-th 
rule and ot^ the number of atoms of species aj produced by the j-th rule. The nt x |^| stoichiometric 
matrix is defined by v ; j = af- — OCfjFl 

Let [a] denote the concentrations of the atoms of specie a occurring at top level in a given compart- 
ment of type I. If dt = a/j . . . a ir (r,- > 1), let [a,-] denote the product [a,-,] . . . [a,, ] of the concentrations 



Many of the a ; j, cc^j are usually 0. 
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Figure 5: Deterministic simulation of the dynamics of the competitive species inside the compartment 
IN (left figure) and outside the compartment (right figure) 

of the species occurring in dt in the considered compartment^] The evolution of the given compartment 
of type £ is modelled by the following system of ODEs: 

Computationally, ODEs are well studied and understood. They can be solved using a variety of nu- 
merical methods, from the Euler method to higher-order Runge-Kutta methods or stiff methods, many of 
which are readily available in software packages that can be easily incorporated into existing simulation 
code. 



Running Example: Deterministic Simulations To perform a deterministic simulation of the toy case 
study we have to remodel it by using only biochemical rewrite rules. This can be done by phrasing the 
compartmentalisation by using a different name for the species occurring in the compartment IN, namely: 
A IN , B IN and C IN . Then, the three non-biochemical rewrite rules (N\), (N2) and (N3) can be converted 
into the following biochemical rewrite rules: 

(B I0 )T :A' N Aa (Bu)T:B in A-B (B l2 )T : C A C m 

The conversion of the biochemical rules (Si) , . . . , (S9) is straightforward. For instance rule (Si) is 
converted into the two rules: 

(B'l ) T : A i—t-A A (B") T : A ,N i—> A m A ,N 

The converted starting term is 2 x C 2 x A IN 2 x B IN . The results of the deterministic simulation are 
shown in Figure [5] 

4 Being d, a multiset, if an element aj occurs h times in d; (for instance, for h = 2, i p = i q for some (1 < p,q < i n , p^q) its 
concentration is considered h times in [a,], i.e. we take the h-th power of [a;]. 
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Remark 3.2 The conversion of the original CWC system for the toy case studies into a CWC system 
using only the top level compartment and biochemical rewriting rules has been straightforward since the 
original system has a fixed compartment structure. Such a conversion, as well as the direct representation 
of the modelled biological system in terms of ODEs, might be quite complicate (or even impossible) in 
case of biological systems that during their evolution may change the structure of the compartments, 
maybe by creating a possibly unbounded number of new compartments. 

4 Hybrid Evolution 

The stochastic approach is based on a probabilistic simulation method that manages the evolution of exact 
integer quantities and often requires a huge computational time to complete a simulation. The ODEs 
numerical approach computes a unique deterministic and fractional evolution of the species involved in 
the system and achieves very efficient computations. In this section we combine both methods within 
CWC, defining a hybrid simulation technique. 

Given a CWC system we partition it into a set of biochemical rewrite rules 38 and a set of non- 
biochemical rewrite rules JV . Rules in J/ are always applied by using the stochastic method. Rules in 3$ 
might be applied with the ODEs approach. In general 3$ might contain both rules that model evolution of 
large numbers of molecules according to very fast reactions (whose execution is suitable to be correctly 
computed with ODE) and rules that model very slow reactions or reactions that involve a very small 
number of reagents. In the latter case it is convenient to compute the execution of the associated rule 
according to the stochastic approach. 

According to the state of the system, a rule might be dynamically interpreted either as stochastic 

or deterministic. For instance, during a simulation, it might happen that a given biochemical rewrite 

-it - 
rule £ : di 1—4- bt £ 33 is applied initially according to the stochastic semantics, since the associated 

compartment contains a very small number of reagents. After the system has evolved for some time, 

however, the concentration of the reagents involved in the rule can be substantially increased and it 

becomes convenient to model the corresponding reaction according to the deterministic approach. 

Actually, at the beginning of each simulation step we build, for each compartment in the term, a 
system of ODEs for the simulation of the biochemical rules in that compartment which (1) are sufficiently 
fast and (2) involve reagents with a sufficient concentration. For the remaining rules the evolution is 
determined by the stochastic simulation algorithm. 

In order to describe the hybrid semantics we assume that, given a CWC term t, each compartment 
of t is univocally identified by an index i . The index of the (implicit) compartment at the top level will 
be denoted by Iq. The biochemical reagents of a compartment (a J t) with index i, written BR(i), are 
expressed by the multiset of the atomic elements appearing in the top level of t. For example, given the 
term 

t = aab(c\(de\.f ff (c j f g? 

and assuming that the compartment (cj (d e\ •Y f) e has index l\, the compartment (d e\ •Y has index 
12 and the compartment (cj / g) e has index 13, we have that BR(io) = a a b, BR(ii) = /, BR(l2) = • 
andBR(i 3 )=/£. 

A basic point of our hybrid approach is the criterium to determine, at each computation stage, the 
reductions to compute in stochastic or in a deterministic way. In this paper we have chosen simply to 
put a threshold on the number of possible reagents and on the speed of the reaction, but other more 
sophisticated criteria should be investigated. 
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Let t denote the whole term and let / denote the set of compartment indexes occurring in t. 

1. For each compartment i £/: 

• Let £ be the label of l, let % = S$i and let ^ = 0. 

h 

• For each biochemical rule B, = £ : a,- i — -)■ bi G ^ let a,- = a,-, ... a,- (r; > 1) and let 
[dj l ]',..., [di r . ] 1 denote the concentrations of the species occurring in a, within the multiset 
B R ( i ) . Let Kj be the rate of the application of rule B, in i . Namely, K\ = k x ■ [a Xl ] 1 ■ . . . ■ [a; r . ] 1 . 
If K\ < <p or min{ [a^ ]',..., [ai r , ] 1 } < y remove B, from & x and put it into S? x . 

2. Considering the rules in \J ieI 5? x U JV select according to Gillespie's method and to the semantics 
given in Section [^T] a stochastic transition step C[Pa] - ° cc( - R ' C ^ p<y ^ c ^ 0a ^ > c[Oa]), where R = £ : 

Let X be the corresponding time interval. ' 

3. For each compartment i in /: 

• Let $ x denote the system of ODEs for the rules in 9> x in the compartment i as explained in 



Section 3.2 without considering, in the compartment i' where the stochastic transition step 
takes place, the active reagents appearing in the left part P of the stochastically applied rule. 
(If % = then S x = 0.) 
• Apply the system of ODEs S x to the biochemical reagents BR(i) of the compartment for a 
time duration T. 

4. Update the term t according to the right part O of the chosen stochastic rule and to the applications 
of the systems of ODEs. 



< In some rare case, it may happen that no rule in (Uie/^i) U ^ is applicable. In such cases the evolution of the system 
must be determined for some time T according to the deterministic semantics only. In our implementation we choose as t the 
maximum time calculated by Gillespie's algorithm for each of the applicable biochemical rules in UieJ-^i- 

Figure 6: Steps performed by an hybrid simulation iteration 



Given a term t to reduce, a rate thresholds and a concentration threshold i/a, each iteration of the 
hybrid reduction semantics performs the four steps listed in Figure[6j where, for every label I, the subsets 
of SB and JV containing the rules with label £ are denoted by 9St and J%, respectively. The first step 
identifies, for each compartment i G / (where / is the set of all compartment indexes occurring in t), two 
disjoint sets of biochemical rules, namely & x (to be applied deterministically) and (to be applied, 
together with the rules in JV , according to the stochastic method). The second step selects, considering 
only the rules in Uie/^i the next rule to be applied stochastically. The third step computes a 
system of ODEs S x for each compartment i £ / and applies the ODEs for the time duration selected by 
the stochastic step. The fourth step updates the terms according to the results of the simulation^] 

In general, if reactions are fast enough, the deterministic ODEs simulation approximate better the 
exact stochastic simulations. This is the idea behind the use of the threshold . The use of \\f, instead, 
allows to prevent the rounding approximation error that may derive when we are dealing with species 
at low concentrations. Combined together, the thresholds <p and y affect the level of approximation we 



5 Note that since ODEs deal with fractional quantities, a rounding operation will be needed before computing the next 
stochastic step. 
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Figure 7: Two different runs of the hybrid simulations showing the different behaviour of the dynamics 
of the competitive species inside the compartment IN (on the left side) and outside the compartment (on 
the right side) 



want to use in our simulations. Notice that with = +oo all reactions will be considered too slow and 
the simulation will be computed with the purely stochastic method. 

Running Example: Hybrid Simulations The hybrid simulations were performed by using thresholds 
<p = 60 and y = 60, which were determined as feasible thresholds to catch the initial stochastic effects 
for the multistable behaviour of the dynamic system. 

In Figure [7] we report two runs of the hybrid simulations showing two different evolutions of the 
competitive species inside the compartment IN (on the left side) and outside the compartment (on the 
right side). The hybrid method allows the synthesis of these evolutions of the system (among the many 
possible ones) by identifying, through the choice of the thresholds and i/a, the system state in which 
the populations are considered large enough to allow differential equations to take the government of 
the system evolution. In this example in fact, as it is easy to verify, while the destiny of the species 
is determined by the stochastic bootstrap, after the populations have reached the chosen threshold their 
evolution becomes regulated only by ODE transitions. The deterministic computation of the evolution 
of the system then tracks an average behaviour reducing the computational load. At this point, in fact, 
stochastic fluctuations, can be considered irrelevant for the description of the overall behaviour of the 
system. The moment in which the simulations move from the stochastic method to the deterministic one 
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can be easily captured in the graphs shown in Figure [7] 

The hybrid method provides a faster final solution with respect to the pure stochastic approach by a 
factor of 10 giving a very satisfiable qualitative analysis of its behaviour. 

5 A Real Model of Different Cellular Fate 

To assess the soundness and efficiency of our hybrid approach on a real biological problem we decided 
to apply it to a well known system where stochastic effects play a fundamental role in determining its 
development: the HIV-1 transactivation mechanism. 

After a cell has been infected, the retrotransposed DNA of the virus is integrated in the host genome 
and it begins its transcription in mRNA and then the translation to yield viral proteins; the initial speed 
of this mechanism, however, is fairly slow. The speedup of the viral production process is determined 
by a regulation system driven by the viral protein TAT: this protein is capable of binding cellular factors 
of the host to produce the pTEFb complex which in its acetylated form is able to bind to the integrated 
viral genome and speed up the transcription machinery, thus ending in more viral proteins and, therefore, 
more TAT, determining a positive loop. 

The time scale during which this loop is triggered is affected by many factors: the initial low TAT 
production and the rate of its degradation, the equilibrium between the active (acetylated) and inactive 
form of pTEFb and so on. As a consequence, the stochastic oscillations in this events are considered 
pivotal in determining when viral proteins are produced in a sufficient quantity to determine cellular 
lysis and viral spreading. Since HIV is known to stay dormant and inactive in some types of cells and 
since the time between the infection and the high viral production rate related to the active phase of AIDS 
is variable, this transactivation mechanism is of great interest. 

We decided to follow the direction taken in a previous study about this system (see [25]), in which 
an experimental setting is developed where a fluorescent protein, GFP, is the only one encoded by an 
engineered viral genome, along with TAT. In [25 ] they were able to identify different evolutions in the 
GFP level over time: cellular clones with exactly the same genome showed two different behaviour, one 
produced a high quantity of GFP (they called it "bright") and the other one with very little GFP ("off"). 
This work also reported that a purely stochastic simulation was able to individuate this bifurcation; a later 
work (see [14]) confirmed these results performing purely stochastic and mixed deterministic-stochastic 
simulations. 

Since C WC systems are able to represent compartments, we slightly modified the original set of rules 
used in these works to explicitly represent the cytoplasm and the nucleus of an infected cell; all the kinetic 
rates were maintained, the one for TAT nuclear import has been determined from the literature (see ifTTl ). 
The set of rules we adopted is given in Figure [8] where we refer to the cytoplasm as the T compartment 
while Tj is the label used for the nucleus. As regards the rules: (Bi) represents the slow basal rate 
of viral mRNA transcription; (N\) describes the mRNA export from the nucleus to the cytoplasm; (#2) 
and (B3) express the translations of this mRNA into GFP and TAT proteins, respectively; (N2) and (N3) 
represent the nuclear import and export of TAT; (B4.) and (i?5)models the binding and unbinding of TAT 
with (not represented here) host cellular factors and the viral genome portion LTR that forms pTEFb 
which, when acetylated (by rule (B(,)) determines an higher transcriptional activity, which is represented 
in (Bg) by the unbinding that releases LTR and TAT and creates an mRNA molecule (note the higher 
rate with respect to (fii)); (B7) represents the pTEFb deacetylation and (B9), (Bio) and (fin) model 
the degradation processes of the proteins and the mRNA (note that mRNA degrades both in the nucleus 
and in the cytoplasm, the other proteins only degrade in the cytoplasm; also note how the compartment 
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1 v 1 n~9 

(fli ) TJ : LTR LTR mRNA 

(Ni) T : (xj mRNA X)1 P xl0 ~? ( X J x)^ mRNA 

(B 2 ) T : mRNA H mRNA GFP 

1 ^9 v 1 0~3 

(B 3 ) T : mRNA i > m/WA TAT 

(iV 2 ) T : (xj X)^ TAT f- 5xl °~) (xj TAT X)^ 

(N 3 ) T : (xj TATX) 7 ! l 2xW ^ ( x j TAT 

(5 4 ) tj : 7AJ LTO i ) ^r£"Fft 

(B 5 ) 7] : ^reFft I ) TAT 

(B 6 ) tj : pTEFb t^~^ pTEFbuac 

(B 7 ) T7 : pTEFb-dc ^> pTOFft 

(fl 8 ) T7 : pTEFb-ac H LTR TAT mRNA 

(B 9 ) T:GFP ? mxW 1 . 

(B l0 ) T:TATt^l. 

t-n \ -r niTi 4.8xlCT 5 

(Bn) T, tj : mRNA i >■ • 



Figure 8: CWC rules for the TAT transactivation system 



labelling mechanism allows to express this fact in a simple and elegant way). 

We performed 100 purely stochastic simulations (i.e. setting = +oo) and 100 hybrid simulations 
(using = 0.5 and y = 10). The values of and y for the hybrid simulation were determined, after a 
few experiments, as the right values to be able to grasp the stochastic effects for this system. The initial 
term of our simulations is represented by the CWC term 

75000 x GFP 5 x TAT (•\LTR) ri , 

while the time interval of our simulations has been fixed to 10 6 seconds (the same parameters are used 
in E51 [T4l ). Both our stochastic and hybrid simulations clearly showed the two possible evolutions 
of the system which correspond to the "bright" and the "off" cellular populations (in order to display 
the double destiny, almost all the biochemical rewrite rules have to be simulated with the stochastic 
approach). As could be seen in Figure [9} the hybrid simulations are comparable to the purely stochastic 
ones and, even with the relatively high thresholds used in this particular case, the hybrid simulations 
were computationally more efficient (almost 40% faster)Jj 



6 Conclusions 

As we have seen, CWC allows to model cellular interaction, localisation and membrane structures. Other 
formalisms were developed to describe membrane systems. Among them we cite Brane Calculi (6[ and 
P-Sy stems BUI . 

CWC can describe situations that cannot be easily captured by the previously mentioned formalisms, 
which consider membranes as atomic objects (extensions of P-Systems with objects on membranes can 

^Comparisons are made using the same stochastic engine, in both cases with no particular optimization. 
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Figure 9: Two different simulations pure stochastic (on the left) and hybrid (on the right) started with the 
same parameters: "bright" and "off" behaviour 

be found in ||5] 13). Representing the membrane structure as a multiset of the elements of interest al- 
lows the definition of different functionalities depending on the type and the number of elements on the 
membrane itself. 

In this paper we have defined a hybrid simulation technique for systems described in CWC, which 
combines the stochastic approach with the deterministic one obtained through ODEs. The method alter- 
nates discrete transitions, computed probabilistically according to the stochastic method, and continuous 
transitions, computed in a deterministic way by a set of ODEs. Our technique turns out to accurately 
capture the dynamics of systems that exhibit stochastic effects and takes advantage, whenever the deter- 
ministic approach is applicable, of the efficiency of the ODEs simulation method. 

The running examples used to make comparisons between the different quantitative simulation meth- 
ods, and the HIV-1 transactivation mechanism are challenging tests for our hybrid methodology. On the 
one hand, the running example shows that, from the methodological point of view, several runs of hybrid 
simulations on a model of dynamic competitive populations allow to "synthesize" a set of stochastic 
experiments avoiding the statistical assumptions about the initial distributions of the parameters that are 
needed in a purely deterministic or purely stochastic analysis. On the other hand, the simulation of the 
HIV-1 transactivation mechanism follows a simulation which is almost purely stochastic: only a few 
rules pass the threshold condition, thus the computational gain of the deterministic approach is, in this 
particular case, very limited (even if still sensible). 

Compartment labels introduced in this paper are a novelty with respect to the original CWC calculus 
presented in [[8). As we have seen, these labels are necessary when building a system of ODEs for 
a compartment of type I. However, we might exploit these labels as an intrinsical information about 
the properties of a compartment. For example, assuming that compartments of the same type have 
approximatively the same volume, we might use the compartment type to define a set of biochemical 
rules whose kinetics incorporate the information about the volume of the compartment on which the 
rule could be applied. Suppose, in practice, to analyse a system in which two different kind of cells 
may interact. Let's call l\ and 1% the compartment types of the two kinds of cells. Suppose, then, that 
particles a and b are free to float between these cells and the top level interspace hosting all the cells. 
Finally, particles a and b may interact by complexation and produce the particle c. If it holds that the top 
level interspace on which the different cells float has around lOOx the volume of a cell of type £\ and if 
a cell of type l\ has around 3x the volume of a cell of type £2, we can express the different speeds of the 
a — b complexation in the different compartments (according to their volumes) with the three following 
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rules: 

-,- , k „ , MOO „ , k-300 

I :a#i — )• c, l\\ab\ — >c, 12 '■ a b 1 — >c. 

Actually, it is crucial to consider in detail the volumes of the involved compartments and to con- 
sider adequate kinetics for the biochemical rules used to simulate the system behaviour. We notice, in 
particular, that the approach based on ODEs directly translates chemical reactions into mathematical 
equations and computes the concentrations over time of the involved species (usually the molar con- 
centration, which denotes the number of moles of a given substance per liter). Models based on the 
stochastic approach, instead, simulate the activity of each single individual involved in the evolution of 
the system. Such a delicate difference between the two methods should be carefully taken into account 
when developing the set of rules to be simulated with the hybrid approach. The version of CWC with la- 
belled compartments presented in this paper simplifies this kind of analysis and allows for more accurate 
simulations. 
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